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Telecommunications apparatus and method 
for detecting any unusual activity in customer be- 
haviour. A comprehensive behavior profile of a cus- 
tomer is generated on the basis of customer trans- 
actions. The profile includes a short-term customer 
behavior obtained from all of the customer's trans- 
actions, and further includes a long-term customer 
behavior obtained on the basis of the generated 
short-term behavior. Any behavior deviation from 
the profile is detected and identified as fraudulent 
or unusual. 
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TELECOMMUNICATIONS SYSTEM FOR GENERATING 
A THREE-LEVEL CUSTOMER BEHAVIOR PROFILE 
AND FOR DETECTING DEVIATION FROM THE PROFILE TO IDENTIFY 

FRAUD 

5 

COPYRIGHT NOTICE 

A portion of the disclosure of this patent document contains material which is 
subject to copyright protection. The copyright owner has no objection to the facsimile 
10 reproduction by anyone of the patent document or the patent disclosure, as it appears in the 

Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights 
whatsoever. 

BACKGROUND OF THE INVENTION 

15 The present invention is related to telecommunications, and in particular, to a 

telecommunications system and method for generating a normal behavior profile of a customer 
and for determining a deviation from the generated profile to detect fraudulent activity. 

It is well known that the telecommunications industry regularly suffers major 
losses due to fraud The various types of fraud may be classified into two categories: 
20 subscription fraud and superimposed fraud. In subscription fraud, an account is obtained without 
any intention to pay the bill. In such cases, abnormal usage occurs throughout the active period 
of the account. The account is usually used for call selling or intensive self-usage, for example. 
The superimposed fraud is carried out when fraudsters "take over" a legitimate account The 
abnormal usage is superimposed upon the normal usage of a legitimate customer. Examples of 
25 such cases include cellular cloning, calling card theft, and cellular handset theft, to name a few. 
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To combat telecommunications fraud, various conventional techniques attempt to 

discover so-called "probably fraudulent" patterns based on historical data and then to detect the 
"probably fraudulent" patterns. The fraud detection system collects data representing the prior 
transactions by the calling party, by the user of credit or debit cards, etc. The collected data is 
then searched for the "probably fraudulent" patterns in user behavior. For example, if the 
person's international telephone calls continue for over 2 hours in a 24-hour time period, such 
activity would most likely constitute a fraudulent pattern. 

This conventional approach to fraud detection, however, is limited in several ways 
and has a number of disadvantages. First, fraud patterns are customer-dependent. Since each 
customer demonstrates an individual behavior, certain usage patterns may be suspicious for one 
customer, but are normal for another. Second, in order to construct a comprehensive fraud 
classification system, examples of all fraud patterns must be taken into account. The large 
number of possible fraud patterns and, the constant emergence of new ones make it impractical to 
create such a fraud classification system. Further, it is difficult to obtain training data that is 
properly classified as fraudulent and non-fraudulent. 

A need therefore exists to overcome the disadvantages of the above-noted fraud 
detection approaches, as well as other conventional approaches to fraud detection in the 
telecommunications industry. 

SUMMARY OF THE INVENTION 

It is an object of the present invention to generate a normal (ordinary) behavior 
profile of a customer. 
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It is another object of the present invention to detect a deviation from the 
generated normal behavior profile. 

It is yet another object of the present invention to identify any unusual activity on 
behalf of the customer. 

These and other objects, features and advantages are accomplished by a computer 
implemented method and apparatus for determining a normal customer behavior profile that 
includes a plurality of transactions pertaining to an activity. The normal behavior profile is used 
to alert of an unusual activity. According to the present invention, a number of prototypical 
transactions is selected from a plurality of transactions. The extracted prototypical transactions 
collected during a first predetermined time interval are arranged into a first behavior profile. A 
plurality of first behavior profiles is obtained during a second predetermined time interval that is 
comprised of a plurality of first predetermined time intervals. The first behavior profiles obtained 
during the second predetermined time interval are arranged into a number of clusters. A 
prototypical first behavior profile is determined for each cluster, and the determined prototypical 
first behavior profiles are arranged into a plurality of records for representing a second behavior 
profile. 

In accordance with one aspect of the present invention, each prototypical first 
behavior profile is located at a respective center of each cluster. 

In accordance with another aspect of the present invention, each transaction is 
defined by at least one attribute which is represented either non-numerically or numerically. 
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BRIEF DESCRIPTION OF THE DRAWINGS 

The following detailed description is read in conjunction with the accompanying 
drawings, in which: 

Fig. 1 A is a system block diagram of the present invention in accordance with one 
5 embodiment thereof; 

Fig. IB shows a first level profile and is a block diagram of a call detail record 
(CDR) containing various fields which represent attributes of a call; 

Fig. 2A is a flowchart for prototyping the first level profile according to the 
present invention; 

10 Fig. 2B graphically illustrates the operation of prototyping the first level profile of 

Fig. 2A; 

Fig. 3 is a block diagram of a second level profile as the result of prototyping the 
first level profile; 

Fig. 4A is a flowchart for extracting daily prototypes according to the present 

15 invention; 

Fig. 4B is a flowchart for a clustering operation according to the present 

invention; 

Fig. 4C graphically illustrates the operation of extracting daily prototypes from 
the second level profile of Figs. 4 A and 4C; 
2 o Fig. 5 is a block diagram of a daily prototype record as the result of extracting the 

daily prototypes; 

Fig. 6 is a block diagram of a third level profile containing daily prototype 

records; and 



4 



WO 00/64193 PCT/ILOO/00232 
Fig. 7 is a flowchart for detecting a deviation from the normal behavior profile. 



DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

As a general overview, the present invention detects fraud in telecommunications 
by determining any significant deviation from the customer's normal (ordinary) behavior. 
Unlike the conventional methods that detect "probably fraudulent" patterns, the present invention 
constructs a comprehensive multi-level model of the normal behavior and detects any deviation 
from the normal behavior. The inventive model of the normal behavior captures a variety of 
customer's behaviors on the basis of multiple transaction attributes. Three hierarchical profile 
levels are generated, and each profile level is generated on the basis of the preceding (lower) 
profile level. Once the normal behavior model is established, a new instance of customer's 
behavior is compared to the normal behavior model; and a significant deviation is alerted as 
fraudulent. In the present invention the term "deviation" includes a dissimilarity between any 
two instances of behavior representation. 

In conjunction with the figures, a telephone system will now be described in detail 
as one representative embodiment of the present invention. As shown in a system block diagram 
of Fig. 1A, a calling party 14 places a telephone call to a called party 12 via a network 10. Also 
connected to the network 10 is a programmable controller 16 which may be embodied as a 
general-purpose computer programmed to perform the inventive operations as described 
hereinbelow and illustrated in the figures. Alternatively, the programmable controller 16 may be 
a specific computer programmed to execute those inventive operations. 

Fig. IB shows a block diagram of a call detail record (CDR) 100 containing 
various fields which represent attributes of the telephone call placed between the calling and 
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called parties of Fig. I A. The programmable controller 16, which may be located in a system of 

a local or long distance carrier, generates the CDR 100 offline for every telephone call. The 

CDR 100 is generated after the completion of the call. As shown in Fig. IB, some representative 

fields of the CDR 100 for the landline telephone call that are relevant to a customer behavior are 

duration, start time, destination type, call type. It is understood, of course, that additional fields 

(attributes) may be included depending on the call. For example, an attribute specifying the 

origination location of the call is important to the providers of wireless (cellular) services; while 

long-distance carriers may require an attribute that specifies countries and/or continents, for 

example. 

Referring to Fig. IB, the duration and start time of the telephone call take on 
numerical values. The destination type may be classified as Local, International, Premium Rate 
Services, Toll-free, and the call type is either voice or data. 

Pertaining to one call, the CDR 100 as generated by the programmable controller 
16 represents the first level profile (a so-called call profile), as indicated in Fig. IB. Call 
prototypes are extracted from the first level profile in accordance with Fig. 2A. In accordance 
with the present invention, a set of prototypes is obtained by prototyping the call profile in such a 
way that any new call has a prototype either substantially similar or substantially dissimilar to 
any member in the set of the prototypes. 

The results of prototyping the call profile are illustrated in Fig. 2B. In particular, 
the above-described attributes of the call (the CDR fields) that are defined by numerical values 
are divided into a finite and preferably small number of values (step 200 of Fig. 2 A). These 
attributes are the duration and start time in the example of Fig. IB. The values for the duration 
of the telephone call are theoretically infinite, ranging from 0 seconds to N seconds; and 
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similarly the start time contains numerous entries, defined by 00-hr, 00-min, 00-sec to 23-hr, 59- 

min, 59-sec. In accordance with one aspect of the present invention, the call duration is divided 

into a predetermined finite number of values. For example, 1 2 five-minute intervals cover any 

call up to an hour long. Since a predominant majority of the calls are shorter than one hour, such 

12 values representing the duration of the call do not introduce any significant error into the 

analysis. 

Similar to the duration time, the start time of the call is partitioned into a finite 
number of values (step 200 of Fig. 2A). For example in accordance with one embodiment, 12 
two-hour time windows adequately represent the start time attribute of the telephone call. While 
the two attributes of duration snd start time both are divided into 12 in this example, the identical 
partitioning is not a requirement of the present invention. The numerical attributes can be 
represented by any number of discreet values that are independent of each other. It will be 
appreciated that the processing speed and storage requirements of the system are not directly 
affected by the number of values resulting from the partitioning operation, because a vector 
contains, on average, 3 non-zero entries. The entries are stored in memory in avariable-length 
vector. 

Following the above-described division of each numerical attribute into a 
predetermined finite number, the operation of prototyping the call profile is completed as 
follows: the combination of all generated values of numerical attributes is generated as shown in 
step 202 in Fig. 2A, and the generated combinations are associated with the non-numerical 
attributes of the telephone call (step 204 in Fig. 2A). In particular, referring to the above 
example of 12 values each for the call duration and start time, there are 144 combinations or call 
prototypes, so-called prototypical calls. Since the destination type attribute is non-numerical, 
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144 prototypes are allocated to each of the four values of this attribute (Local, International, 
Premium Rate Services, Toll-free) resulting in 756 call prototypes. Further, there are 2 values of 
non-numerical call type (Voice, Data). As the result, the total number of call prototypes is 1 152. 
Fig. 2B graphically illustrates the above-described operation of call profile 
5 prototyping to facilitate the understanding thereof. In the X-Y coordinate plane, the call duration 
values are plotted along the Y-axis, and the start time values are plotted along the X-axis labeled 
"Time of Day". Each depicted square, located at the intersection of X-Y coordinates in Fig. 2B, 
represents a call prototype (a prototypical call), such as a first Call Prototype 250, a second Call 
Prototype 252, an N-th Call Prototype 254, etc. Further shown in Fig. 2B are four X-Y planes 
10 for each instance of the destination type attribute; and the sub-space containing the four X-Y 
planes is provided for each of the voice and data call types. 

Following the first level profile processing referred to as the call profile 
prototyping and described above, the second level profile is created on the basis of the extracted 
call prototypes- The second level profile is a so-called daily profile and represents the short-term * 
15 behavior of a customer. As illustrated in Fig. 3, a daily profile 300 for a particular customer 
includes two parts: a quantitative profile 302 and a qualitative profile 304. The quantitative 
profile 302 indicates the usage volume, that is, the number of calls made in one day by the 
customer. The qualitative profile 304 denotes the nature of calls placed by the customer during 
the day. On the basis of the above-mentioned representative attributes of the telephone call, the 
2 0 qualitative profile 304 specifies for a particular day how long the calls lasted, to where and at 
what time they were placed, etc. 

In particular, the qualitative profile 304 is a vector containing an entry for each 
call prototype extracted from the call profile (the first level profile). The entry represents how 
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many calls, in terms of a percentage of the total daily calls, represented by a particular call 
prototype were made by the customer on a given day, as shown in Fig. 3. For example, 20% of 
all daily calls were placed by the customer between 4 p.m. and 6 a.m. on December 31, 1998, 
lasted between 5 to 10 minutes, and were international calls. Referring to the above-mentioned 
example, 1,152 call prototypes are determined; and each of those call prototypes is associated 
with a value that indicates a percentage of all calls made during the 24-hour period, for example. 
Due to the diverse attributes of telephone calls, the qualitative profile 304, as defined by the 
vector shown in Fig. 3, represents a multi-dimensional probability distribution of calls made on a 
given day. 

As illustrated in Fig. 4C, a daily profiles space 400 contains all possible daily 
profiles. The number of all possible daily profiles is infinite, and consequently is umanageable 
due to computational and storage constraints. Therefore, it is needed to represent the daily 
profiles space 400 by a finite number of daily profiles. For this purpose, the daily profiles 300 
are sampled over a preselected period of time, for example 60 days. 

A clustering operation is then applied to the sampled set of daily profiles, and 
daily prototypes (so-called prototypical days) are determined on the basis of the clustering 
operation. The operation of extracting daily prototypes from the daily profiles 300 (the second 
level profile) is graphically shown in Fig. 4C to facilitate the explanation of this operation. The 
clustering operation arranges the daily profiles 300-1, 300-2, 300-3, ... 300-N into several 
clusters 402, 404, 406 as shown in Fig. 4C (step 410 in Fig. 4 A). The clustering operation of 
step 410 in Fig. 4 A includes the following steps: 

1 . In the space of daily profiles, randomly selecting a number of daily profiles for 
designation as cluster centers (step 416 in Fig. 4B). 
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2. Assigning each daily profile to its closest cluster center (step 418 in Fig. 4B). 

3. Recalculating each cluster center as the Euclidean centroid of daily profiles 
assigned to the corresponding cluster center (step 420 in Fig. 4B). 

4. Repeating steps 1 through 3 until no daily profile changes membership among 
the clusters (step 422 in Fig. 4B). 

The clustering operation is aimed at minimizing the sum of squared distances between daily 
profiles and their associated cluster center. The iterative procedure of recalculating the new 
cluster center as the Euclidean centroid of the locally assigned daily profiles accomplishes the 
minimization operation. 

To calculate those distances, a Cumulative Distribution-based (CD) distance 
function based on cumulative distribution is proposed as follows. In the one-dimensional case, 
the distance between two continuous probability distributions is defined as follows: 

Let f,(x) and f 2 (x) be two probability distributions of a random variable X. The 
distance between f,(x) and f 2 (x), denoted as d(f,, f 2 ) is: 

[\Xmax Xminjjrnun J 
Where V(, > Xmm . , < x Jj V(/density function on X) f(x) = 0 

Xmm 

F(x) is the cumulative distribution corresponding to density function f(x). 
and are the maximum and minimum values respectively that the random variable X can 
reach. X^ and X roin are defined such that the probability of values outside (X^, X^ is 
insignificant. X^ and X^ do not depend on the two density functions being compared, but 
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solely on the random variable X. The denominator is used for normalization, so 0<d(f,, f 2 )<l is 
obtained. 

In the discrete, ordinal case, the domain of the random variable X is D = {x„ x 2 , 
x„}, x^ > x 3 <=> i > j, and the distance between two probability distributions is 



where 5j = x^, - Xj 

In the discrete case, the cumulative distribution is a step-function, and the formula 
is a simplification of the general formula for the subset of discrete probability distribution, x, 
and x,,, the smallest and largest discrete values, respectively, serve as and X max . 5j are the 
differences between consecutive discrete values. It is noted that 6j are not necessarily equal, that 
is, the discrete values do not have to increase by a fixed value. Therefore, to improve 
computation efficiency, only Xj is considered such that P^) 0 for (i=l ,2). In that case, 6j 
would be the difference between two consecutive non-zero Xj 4 s. Binary random variables are 
treated as discrete ordinal random variables, with the domain {0,1 } . In the discrete, non-ordinal 
case, such as the call type for example, the domain consists of non-numeric values. It is assumed 
that the similarity between each pair of values is equal. In this case, the attribute X* is replaced 
with n^ binary attributes, where n^ is the number of values of attribute L 




In the multi-dimensional discrete case: 

f r 



(3) „ {Pl , P2)= £ 



^(F,(x^)-F 2 (^)) 2 ^ 



J) 



where 




w^s a weight for attribute i of the call profile. 
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An update of clusters is carried out in the following manner. The clustering 

operation performs the above-described steps 1-4. If a daily profile is not close enough to any of 
the existing cluster centers, new cluster and cluster center are formed. Hence, the clustering 
operation dynamically adds new clusters) if a daily profile is beyond a predetermined distance 
from the existing cluster centers. 

Following the clustering operation as described above, a daily prototype (a 
prototypical day) is calculated for each cluster (step 412 in Fig. 4A). In particular, following the 
completion of the iterative clustering operations as described above, the daily profiles designated 
as cluster centers become daily prototypes (step 414 in Fig. 4A). As the result, each cluster is 
defined by a daily prototype that represents all of the daily profiles associated with a respective 
cluster. 

After the second level profile processing referred to as the daily profile 
prototyping and described above, the third level profile is generated on the basis of the extracted 
daily prototypes. The third level profile is the overall profile of the customer and represents 
his/her long-term so-called "normal" behavior. There is an entry in the overall profile vector for 
each daily prototype extracted from the second level profile (daily profile). As shown in Fig. 5, 
each entry is a record 500 containing 3 fields. The first field 502 indicates the number of days 
that a daily prototype has been observed during the preselected period of time. The second field 
504 shows the total number of telephone calls made during those days. That is, the number of 
calls made on each day, as indicated in the first field 502, are summed. The third field 506 is the 
sum of squared quantitative profiles. That is, the number of calls made on each day as indicated 
in the first field 502 is squared, and then the products are summed. 
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Fig. 6 shows the long-term or overall profile 600 of the customer calling behavior, 

designated as the third level profile, containing entries for all daily prototypes, such as a first 

Daily Prototype (DP type) 600, a second Daily Prototype (DP type) 602, a third Daily Prototype 

(DP type) 604, etc. As shown in Fig. 6, the customer behaviors on weekdays and weekends are 

represented separately to better approximate the customer overall calling behavior. The third 

level profiles are constantly updated with newly introduced normal second level profiles. This is 

done to capture changes in the customer's normal behavior. 

After generating the normal behavior model comprising the three levels as 

described above, a deviation from the normal calling behavior is detected as shown in Fig. 7. As 

stated above, the daily profile includes two parts: the quantitative profile and the qualitative 

profile. Matching a new daily profile to the third level (overall) profile is executed in two stages 

that involve both quantitative and qualitative profiles. First, the qualitative profile is checked, 

and if there is a "qualitative match", the quantitative profile is then examined. 

In particular, the qualitative profile is obtained (step 700), and a match with the 

third level profile is found if the qualitative profile is substantially close to the nearest non-zero 

daily prototype of the customer (step 702). That is, the qualitative profile matches the third level 

profile when the distance between the qualitative profile and the nearest non-zero daily prototype 

does not exceed a predetermined threshold value, as determined in step 704. The above 

comparing operation is expressed as follows: 

(3) Diff<T, where 

Diff = a difference between qualitative profiles (measures) of a daily prototype 
and of the daily profile under examination as determined on the basis of the CD-distance 
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function; 

T = a predetermined threshold value. 

If the distance does not exceed the threshold (step 706), the second stage of 
analyzing the quantitative profile is carried out. In this stage, the quantitative profile is obtained 
(step 706), and is compared with the third level profile on the basis of the following (step 708): a 
mean value of the daily prototype closest to the examined daily profile, a standard deviation of 
the daily prototype closest to the examined daily profile, and a predetermined threshold value. 
Expressed as an equation, the above comparing operation is as follows: 

(4) (DP - M)/a < T, where 

DP = the quantitative daily profile under examination; 

M = average number of calls in days of the daily prototype closest to the daily 

profile DP; 

a = the standard deviation of the number of calls in days of the daily prototype 
closest to the daily profile DP; 

T = a predetermined threshold value (measured in terms of standard deviation). 

If the predetermined threshold value is not exceeded as determined in step 710, 
then the daily profile under examination is not identified as fraudulent or unusual. However, if 
any of the above equations (3) or (4) do not hold true, then the daily profile is marked as 
fraudulent or unusual, and/or an alert is initiated (step 712). According to this aspect of the 
present invention, the deviation from the normal calling behavior as described above represents a 
genuine dissimilarity between any two instances of the second level profile. The genuine 
dissimilarity is assured by the application of the CD-distance function. 



14 



WO 00/64193 PCT/ILOO/00232 
It is understood, of course, that the present invention described above with 

reference to the telephone call activities is not limited to such embodiment only. The present 
invention is equally applicable to other user transactions and activities by identifying the short- 
term and long-term user behavior using the three-level profile and by detecting a deviation from 
the behavior profile. Such activities include credit card or debit card usage, purchases of goods 
and/or services on-line over the Internet or in-person, and any other multiple transactions by a 
customer/user over a preselected period of time. 

In addition, the present invention may be used for detecting changes in customer 
behavior for other purposes that may not be related to fraud detection. Such purposes may 
include identifying marketing opportunities and offering new incentives or price plans to a 
customer following a behavioral change. 

Furthermore, the above-described operations of identifying changes in the 
behavior may be applied, in addition to an individual customer, to any selected entity, such as a 
group of people, segments of population, countries, etc. 

In addition, in the above exemplary embodiment, the short-term behavior (second 
level profile) includes any calling activity during the 24-hour time interval. Similarly, the long- 
term behavior (third level profile) is representatively set to a 60-day interval. The present 
invention is not limited to such definitions of the short-term and long-term behaviors which can 
be defined for any time interval selected on the basis of the frequency of user transactions. 

It is understood by those skilled in the art that the term telecommunications" as 
used herein is defined as the transmission and reception of signals, either electrical or optical, via 
any medium, such as a wired or wireless conduit. 
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Having described specific preferred embodiments of the invention with reference 
to the accompanying drawings, it is to be understood that the invention is not limited to those 
precise embodiments, and that various changes and modifications may be effected therein by one 
skilled in the art without departing from the scope or the spirit of the invention as defined in the 
appended claims. 
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CLAIMS 

What is claimed is: 

1 . A computer implemented method for determining a normal customer behavior profile that is 
based on at least one transaction pertaining to an activity, said typical customer behavior profile 
being used to alert of an unusual activity, said method comprising the steps of: 

arranging a plurality of first behavior profiles obtained during a first 
predetermined time interval into a number of clusters, each first behavior profile representing the 
transactions collected during a second predetermined time interval that is shorter than said first 
predetermined time interval; 

determining a prototypical first behavior profile for each cluster; and 
arranging the determined prototypical first behavior profiles into a plurality of 
records for representing a second behavior profile. 

2. A computer implemented method for detecting a deviation from a normal customer behavior 
profile that is based on at least one transaction pertaining to an activity, said method comprising 
the steps of: 

arranging a plurality of first behavior profiles obtained during a first 
predetermined time interval into a number of clusters, each first behavior profile representing the 
transactions collected during a second predetermined time interval that is shorter than said first 
predetermined time interval; 

determining a prototypical first behavior profile for each cluster; 

arranging the determined prototypical first behavior profiles into a plurality of 
records for representing a second behavior profile; and 
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comparing a new first behavior profile with each record to identify said deviation 

from said typical customer behavior profile if none of the records substantially matches said new 
first behavior profile. 

3. A computer implemented method for determining a normal customer behavior profile that 
includes a plurality of transactions pertaining to an activity, said normal behavior profile being 
used to alert of an unusual activity, said method comprising the steps of: 

selecting a number of prototypical transactions from said plurality of transactions; 

arranging the selected prototypical transactions collected during a first 
predetermined time interval into a first behavior profile; 

obtaining a plurality of first behavior profiles during a second predetermined time 
interval that is comprised of a plurality of first predetermined time intervals; 

arranging the first behavior profiles obtained during said second predetermined 
time interval into a number of clusters; 

determining a prototypical first behavior profile for each cluster; and 

arranging the determined prototypical first behavior profiles into a plurality of 
records for representing a second behavior profile. 

4. The method according to claim 3, wherein each transaction is defined by non-numerical and 
numerical attributes which are qualitative and quantitative measures of the transaction, 
respectively. 
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5. The method according to claim 4, wherein said step of selecting the prototypical transactions 

includes dividing the numerical attributes into a predetermined number of values and associating 
the values with the non-numerical attributes. 

6. The method according to claim 3, wherein said steps of arranging the first behavior profiles 
and determining said prototypical first behavior profile comprise: 

designating a number of first behavior profiles as respective centers for the 

clusters; 

finding a closest center to each remaining first behavior profile; 
assigning said each remaining first behavior profile to said closest center; 
recalculating each center for a corresponding cluster according to said step of 

assigning; 

repeating said steps of designating, finding, assigning and recalculating until a 
membership of each cluster does not change. 

7. The method according to claim 6, wherein said step of finding includes the step of weighted 
summing the squared distances between the cumulative distributions at each attribute value. 

8. The method according to claim 3, wherein each prototypical first behavior profile is located at 
a respective center of each cluster. 
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9. The method according to claim 3, wherein said first behavior profile includes a vector 

representing a multi-dimensional probability distribution of the transactions carried out during 
said first predetermined time interval. 

10. The method according to claim 3, wherein said first behavior profile includes the number of 
transactions. 

11. The method according to claim 10, wherein each record includes the number of times a 
respective prototypical first behavior profile has been observed during said second predetermined 
time interval, also includes a total number of transactions carried out during said second 
predetermined time interval, and Anther includes a sum of squared number of transactions during 
said first predetermined time interval within said second predetermined time interval. 

12. The method according to claim 3, wherein said second behavior profile is separately 
maintained for each customer for each type of said first predetermined time interval. 

13. A computer implemented method for detecting a deviation from a normal customer behavior 
profile that includes a plurality of transactions, said method comprising the steps of: 

generating said normal behavior profile by: 

selecting a number of prototypical transactions from said plurality 

of transactions; 

arranging the selected prototypical transactions collected during a 
first predetermined time interval into a first behavior profile; 
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obtaining a plurality of first behavior profiles during a second 

predetermined time interval that is comprised of a plurality of first predetermined time intervals; 

arranging the first behavior profiles obtained during said second 
predetermined time interval into a number of clusters; 

determining a prototypical first behavior profile for each cluster; 

and 

arranging the determined prototypical first behavior profiles into a 
plurality of records for representing a second behavior profile determined over said second 
predetermined time interval; and 

comparing a new first behavior profile with each record to identify said deviation 
from said normal behavior profile if none of the records substantially matches said new first 
behavior profile. 

14. The method according to claim 13, wherein said deviation is designated as fraudulent on 
behalf of a customer. 

15. The method according to claim 13, wherein said first behavior profile includes a vector 
representing a multi-dimensional probability distribution of the transactions carried out during 
said first predetermined time interval. 

16. The method according to claim 13, wherein said first behavior profile includes the number of 
transactions. 
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1 7. The method according to claim 16, wherein each record includes the number of times a 
respective prototypical First behavior profile has been observed during said second predetermined 
time interval, also includes a total number of transactions carried out during said second 
predetermined time interval, and further includes a sum of squared number of transactions during 
said first predetermined time interval within said second predetermined time interval. 

18. The method according to claim 13, wherein said second behavior profile is separately 
maintained for each customer for each type of said first predetermined time interval. 

19. The method according to claim 13, wherein said step of comparing includes determining 
whether a difference between said qualitative measure of said new first behavior profile and of 
said prototypical first behavior profile is less than a first predetermined threshold. 

20. The method according to claim 19, wherein said step of comparing further includes 
comparing an expression with a second predetermined threshold if said first predetermined 
threshold is not exceeded, said expression being a function of said quantitative measure, of a 
mean of the first behavior profiles closest to said prototypical first behavior profile, and of a 
standard deviation of the first behavior profiles closest to said prototypical first behavior profile. 

21. The method according to claim 19, wherein said step of determining said distance includes 
the step of weighted summing the squared distances between the cumulative distributions at each 
attribute value. 
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22. Apparatus comprising a programmable controller for determining a normal customer 

behavior profile that includes a plurality of transactions pertaining to an activity, said normal 
behavior profile being used to alert of an unusual activity, said apparatus comprising: 

means for selecting a number of prototypical transactions from said plurality of 

transactions; 

means for arranging the selected prototypical transactions collected during a first 
predetermined time interval into a first behavior profile; 

means for obtaining a plurality of first behavior profiles during a second 
predetermined time interval that is comprised of a plurality of first predetermined time intervals; 

means for arranging the first behavior profiles obtained during said second 
predetermined time interval into a number of clusters; 

means for determining a prototypical first behavior profile for each cluster; and 

means for arranging the determined prototypical first behavior profiles into a 
plurality of records for representing a second behavior profile determined over said second 
predetermined time interval. 

23. The apparatus according to claim 22, wherein each transaction is defined by non-numerical 
and numerical attributes which are qualitative and quantitative measures of the transaction, 
respectively. 

24. The apparatus according to claim 22, wherein said means for selecting includes means for 
dividing said one numerical attribute into a predetermined number of values and for associating 
the values with the non-numerical attributes. 
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25. The apparatus according to claim 22, wherein said means for arranging the first behavior 

profiles and means for determining said prototypical first behavior profile comprise: 

means for designating a number of first behavior profiles as respective centers for 

the clusters; 

means for finding a closest center to each remaining first behavior profile; 
means for assigning said each remaining first behavior profile to said closest 

center; 

means for recalculating each center for a corresponding cluster according to said 
step of assigning; and 

means for repeating the operations of designating, finding, assigning and 
recalculating until a membership of each cluster does not change. 

26. The apparatus according to claim 25, wherein said means for finding includes means for 
weighted summing the squared distances between the cumulative distributions at each attribute 
value. 

27. The apparatus according to claim 22, wherein each prototypical first behavior profile is 
located at a respective center of each cluster. 

28. The apparatus according to claim 22, wherein said first behavior profile includes a vector 
representing a multi-dimensional probability distribution of the transactions carried out during 
said first predetermined time interval. 
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29. The apparatus according to claim 22, wherein said first behavior profile includes the number 
of transactions. 

30. The apparatus according to claim 29, wherein each record includes the number of times a 
respective prototypical first behavior profile has been observed during said second predetermined 
time interval, also includes a total number of transactions carried out during said second 
predetermined time interval, and further includes a sum of squared number of transactions during 
said first predetermined time interval within said second predetermined time interval. 

31. The apparatus according to claim 22, wherein said second behavior profile is separately 
maintained for each customer for each type of said first predetermined time interval. 

32. Apparatus comprising a programmable controller for detecting a deviation from a normal 
customer behavior profile that includes a plurality of transactions, said apparatus comprising: 

means for generating said normal behavior profile comprising: 

means for selecting a number of prototypical transactions from said 

plurality of transactions; 

means for arranging the selected prototypical transactions collected 
during a first predetermined time interval into a first behavior profile; 

means for obtaining a plurality of first behavior profiles during a 
second predetermined time interval that is comprised of a plurality of first predetermined time 
intervals; 
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means for arranging the first behavior profiles obtained during said 

second predetermined time interval into a number of clusters; and 

means for determining a prototypical first behavior profile for each 

cluster; and 

means for arranging the determined prototypical first behavior profiles into a 
plurality of records for representing a second behavior profile determined over said second 
predetermined time interval, said apparatus further comprising means for comparing a new first 
behavior profile with each record to identify said deviation from said normal behavior profile if 
none of the records substantially matches said new first behavior profile. 

33. The apparatus according to claim 32, wherein said deviation is designated as fraudulent on 
behalf of a customer. 

34. The apparatus according to claim 32, wherein said first behavior profile includes a vector 
representing a multi-dimensional probability distribution of the transactions carried out during 
said first predetermined time interval. 

35. The apparatus according to claim 32, wherein said first behavior profile includes the number 
of transactions. 

36. The apparatus according to claim 35, wherein each record includes the number of times a 
respective prototypical first behavior profile has been observed during said second predetermined 
time interval, also includes a total number of transactions earned out during said second 
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predetermined time interval, and further includes a sum of squared number of transactions during 

said first predetermined time interval within said second predetermined time interval. 

37. The apparatus according to claim 32, wherein said second behavior profile is separately 
maintained for each customer for each type of said first predetermined time interval. 

38. The apparatus according to claim 32, wherein said means for comparing includes 
determining whether a difference between said qualitative measure of said new first behavior 
profile and of said prototypical first behavior profile is less than a first predetermined threshold. 

39. The apparatus according to claim 32, wherein said means for comparing further includes 
comparing an expression with a second predetermined threshold if said first predetermined 
threshold is not exceeded, said expression being a function of said quantitative measure, of a 
mean of the first behavior profiles closest to said prototypical first behavior profile, and of a 
standard deviation of the first behavior profiles closest to said prototypical first behavior profile. 

40. The apparatus according to claim 38, wherein said means for comparing includes means for 
weighted summing the squared distances between the cumulative distributions at each attribute 
value. 
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